Results 1 - 4 of 4
1.
Dissertation Abstracts International: Section B: The Sciences and Engineering ; 84(6-B):No Pagination Specified, 2023.
Article in English | APA PsycInfo | ID: covidwho-2301457

ABSTRACT

Interacting with computer systems through speech is more natural than conventional interaction methods. It is also more accessible, since it does not require precise selection of small targets or rely entirely on visual elements such as virtual keys and buttons. Speech also enables contactless interaction, which is of particular interest when touching public devices is to be avoided, as during the COVID-19 pandemic. However, speech is unreliable in noisy places and can compromise users' privacy and security in public. Image-based silent speech, which primarily converts tongue and lip movements into text, can mitigate many of these challenges. Since it does not rely on acoustic features, users can speak silently without vocalizing the words. It has also been demonstrated as a promising input method on mobile devices and has been explored for a variety of audiences and contexts where the acoustic signal is unavailable (e.g., people with speech disorders) or unreliable (e.g., noisy environments). Though the method shows promise, very little is known about people's perceptions of using it, their anticipated performance with silent speech input, or their approaches to avoiding potential misrecognition errors. In addition, existing silent speech recognition models are slow and error-prone, or rely on stationary, external devices that are not scalable. In this dissertation, we attempt to address these issues. We first conduct a user study exploring users' attitudes towards silent speech, with a particular focus on social acceptance. Results show that people perceive silent speech as more socially acceptable than speech input but are concerned about input recognition, privacy, and security issues. We then conduct a second study examining users' error tolerance with speech and silent speech input methods. Results reveal that users are willing to tolerate more errors with silent speech than with speech input, as it offers a higher degree of privacy and security. A third study identifies a suitable method for providing real-time feedback on silent speech input. Results show that users find the proposed feedback method effective and significantly more private and secure than a commonly used video feedback method. In light of these findings, which establish silent speech as an acceptable and desirable mode of interaction, we address the technological limitations of existing image-based silent speech recognition models to make them more usable and reliable on computer systems. First, we develop LipType, an optimized version of LipNet with improved speed and accuracy. We then develop an independent repair model that pre-processes video captured in poor lighting conditions, when applicable, and corrects potential errors in the output for increased accuracy. We test this model with LipType and other speech and silent speech recognizers to demonstrate its effectiveness. In an evaluation, the model reduced the word error rate by 57% compared to the state of the art without compromising overall computation time. However, the model remains susceptible to failure due to variability in user characteristics. A person's speaking rate, for instance, is a fundamental user characteristic that can influence speech recognition performance because of the variation it introduces in the acoustic properties of speech production. We therefore formally investigate the effects of speaking rate on silent speech recognition. Results reveal that native users speak about 8% faster than non-native users, but both groups slow down at comparable rates (34-40%) when interacting with silent speech, mostly to increase recognition accuracy. A follow-up experiment confirms that slowing down does improve the accuracy of silent speech recognition. (PsycInfo Database Record (c) 2023 APA, all rights reserved)
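
[Editor's note] The 57% figure above is a reduction in word error rate (WER), the standard accuracy metric for speech and silent-speech recognizers: the word-level edit distance between the reference and hypothesis transcripts, divided by the number of reference words. A minimal Python sketch of the metric follows; it is not the authors' code, and the example sentences are hypothetical.

    def word_error_rate(reference: str, hypothesis: str) -> float:
        """WER = (substitutions + insertions + deletions) / reference words."""
        ref, hyp = reference.split(), hypothesis.split()
        # Standard dynamic-programming edit distance over words.
        d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
        for i in range(len(ref) + 1):
            d[i][0] = i
        for j in range(len(hyp) + 1):
            d[0][j] = j
        for i in range(1, len(ref) + 1):
            for j in range(1, len(hyp) + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                d[i][j] = min(d[i - 1][j] + 1,         # deletion
                              d[i][j - 1] + 1,         # insertion
                              d[i - 1][j - 1] + cost)  # substitution
        return d[len(ref)][len(hyp)] / len(ref)

    # One deletion ("the") and one substitution ("six" -> "mix") over a
    # five-word reference gives WER 2/5 = 0.4. A 57% relative reduction
    # would take a WER of 0.40 down to about 0.17.
    print(word_error_rate("set the alarm for six", "set alarm for mix"))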

2.
23rd IEEE/CVF Winter Conference on Applications of Computer Vision, WACV 2023 ; : 2216-2225, 2023.
Article in English | Scopus | ID: covidwho-2248160

ABSTRACT

Many people with some form of hearing loss consider lipreading their primary mode of day-to-day communication. However, finding resources to learn or improve one's lipreading skills can be challenging. This was further exacerbated during the COVID-19 pandemic by restrictions on direct interactions with peers and speech therapists. Today, online MOOC platforms like Coursera and Udemy have become the most effective form of training for many types of skill development. However, online lipreading resources are scarce, as creating them is an extensive process requiring months of manual effort to record hired actors. Because of this manual pipeline, such platforms are also limited in vocabulary, supported languages, accents, and speakers, and have a high usage cost. In this work, we investigate the possibility of replacing real human talking videos with synthetically generated ones. Synthetic data can easily incorporate larger vocabularies, variations in accent, local languages, and many speakers. We propose an end-to-end automated pipeline for building such a platform using state-of-the-art talking-head video generator networks, text-to-speech models, and computer vision techniques. We then perform an extensive human evaluation using carefully designed lipreading exercises to validate the quality of our platform against existing lipreading platforms. Our studies concretely point toward the potential of our approach for developing a large-scale lipreading MOOC platform that can impact millions of people with hearing loss. © 2023 IEEE.
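
[Editor's note] The sketch below illustrates the three-stage flow this abstract describes (text-to-speech, talking-head video generation, computer-vision post-processing). All stage functions are placeholder stubs with names we invented for illustration; the abstract does not specify the actual models or APIs, so this shows only the data flow, not the authors' implementation.

    from dataclasses import dataclass

    @dataclass
    class ExerciseClip:
        text: str        # word or phrase the learner must lipread
        video_path: str  # silent talking-head clip rendered for that text

    def synthesize_speech(text: str) -> bytes:
        """Stage 1 stub: a text-to-speech model would return audio here."""
        return b""

    def generate_talking_head(audio: bytes, face_image: str) -> str:
        """Stage 2 stub: a talking-head generator network would render
        the given face speaking the audio and return a video path."""
        return "clip.mp4"

    def postprocess(video_path: str) -> str:
        """Stage 3 stub: computer-vision clean-up, e.g. cropping the face
        region and stripping the audio track for a lipreading exercise."""
        return video_path

    def build_exercise(text: str, face_image: str) -> ExerciseClip:
        audio = synthesize_speech(text)
        video = generate_talking_head(audio, face_image)
        return ExerciseClip(text=text, video_path=postprocess(video))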

3.
Int J Audiol ; : 1-11, 2022 Sep 21.
Article in English | MEDLINE | ID: covidwho-2037258

ABSTRACT

OBJECTIVE: To understand the communicational and psychosocial effects of COVID-19 protective measures in real-life everyday communication settings. DESIGN: An online survey consisting of closed-set and open-ended questions aimed to describe the communication difficulties experienced in different communication activities (in-person and telecommunication) during the COVID-19 pandemic. STUDY SAMPLE: 172 individuals with hearing loss and 130 who reported no hearing loss completed the study. They were recruited through social media, private audiology clinics, hospitals, and monthly newsletters sent by the non-profit organisation "Audition Quebec." RESULTS: Face masks were the most problematic protective measure for communication for 75-90% of participants. For all in-person communication activities, participants with hearing loss reported significantly greater impact on communication than participants with normal hearing. They also exhibited more activity limitations and more negative emotions associated with communication difficulties. CONCLUSION: These results suggest that, in times of pandemic, individuals with hearing loss are more likely to experience communication breakdowns in their everyday activities. This may lead to social isolation and have a deleterious effect on their mental health. When interacting with individuals with hearing loss, communication strategies that optimise speech understanding should be used.

4.
Int J Audiol ; 60(7): 495-506, 2021 07.
Article in English | MEDLINE | ID: covidwho-947618

ABSTRACT

OBJECTIVE: To understand the impact of face coverings on hearing and communication. DESIGN: An online survey consisting of closed-set and open-ended questions distributed within the UK to gain insights into experiences of interactions involving face coverings, and into the impact of face coverings on communication. SAMPLE: Four hundred and sixty members of the general public were recruited via snowball sampling. People with hearing loss were intentionally oversampled to more thoroughly assess the effect of face coverings on this group. RESULTS: With few exceptions, participants reported that face coverings negatively impacted hearing, understanding, engagement, and feelings of connection with the speaker. Impacts were greatest when communicating in medical situations. People with hearing loss were significantly more impacted than those without hearing loss. Face coverings impacted communication content, interpersonal connectedness, and willingness to engage in conversation; they increased anxiety and stress, and made communication fatiguing, frustrating, and embarrassing - both as a speaker wearing a face covering, and when listening to someone else who is wearing one. CONCLUSIONS: Face coverings have far-reaching impacts on communication for everyone, but especially for people with hearing loss. These findings illustrate the need for communication-friendly face coverings, and emphasise the need to be communication-aware when wearing a face covering.


Subject(s)
Auditory Perception, COVID-19/prevention & control, Communication Barriers, Hearing Disorders/psychology, Lipreading, Masks, Persons With Hearing Impairments/psychology, COVID-19/transmission, Cues, Facial Expression, Hearing, Hearing Disorders/diagnosis, Hearing Disorders/physiopathology, Humans, Social Behavior, Visual Perception